Using MAP estimated parameters to improve HMM speech recognition performance
Authors
Yoshihiko Gotoh (1), Michael M. Hochberg (2), Harvey F. Silverman (1)
(1) LEMS, Division of Engineering, Brown University, Providence, RI 02912, USA
(2) Cambridge University Engineering Department, Trumpington Street, Cambridge CB2 1PZ, UK
Abstract
Hidden Markov models (HMMs) have been applied quite successfully to speech recognition tasks, but many unsolved problems remain. HMMs do not directly model all phenomena that might be useful for recognition; duration modeling is one example. Mechanisms are needed to incorporate additional information into an HMM system. This paper presents a maximum a posteriori (MAP) parameter estimation approach for improving the state-duration modeling capability and for incorporating a priori knowledge about the word-duration distribution into an HMM. The MAP-based approach is evaluated on a talker-independent, connected alphadigit task for various prior distributions on duration. The results, in terms of both computational complexity and recognition performance, are compared with those of HMM-based systems trained with the traditional maximum-likelihood criterion.
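The abstract does not state which priors or update equations the paper uses, so the following is only a generic sketch of how a MAP estimate differs from a maximum-likelihood one for a duration-related HMM parameter. It assumes a single state whose duration is geometric, governed by a self-transition probability, and it assumes a Beta prior on that probability; the function names, the Beta prior, and the counts are all illustrative and not taken from the paper.

```python
def ml_self_transition(n_stay, n_exit):
    """Maximum-likelihood estimate of a state's self-transition probability.

    n_stay: number of observed self-transitions (state kept).
    n_exit: number of observed transitions out of the state.
    """
    return n_stay / (n_stay + n_exit)


def map_self_transition(n_stay, n_exit, alpha, beta):
    """MAP estimate under an assumed Beta(alpha, beta) prior.

    This is the mode of the Beta posterior; it requires alpha > 1 and
    beta > 1 so that the mode lies strictly inside (0, 1).
    """
    return (n_stay + alpha - 1) / (n_stay + n_exit + alpha + beta - 2)


def expected_duration(a_self):
    """Mean of the geometric duration P(d) = a^(d-1) * (1 - a), d >= 1."""
    return 1.0 / (1.0 - a_self)


# Hypothetical counts: 3 self-transitions and 1 exit from the state.
a_ml = ml_self_transition(3, 1)
# Hypothetical prior Beta(9, 3), whose mode 0.8 encodes a mean duration of 5.
a_map = map_self_transition(3, 1, alpha=9, beta=3)

print(a_ml, expected_duration(a_ml))
print(a_map, expected_duration(a_map))
```

With few observations the MAP estimate is pulled toward the prior mode, and with many observations it approaches the maximum-likelihood value, which is the general mechanism by which prior duration knowledge can enter the model.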
Similar resources
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Presentation of K Nearest Neighbor Gaussian Interpolation and comparing it with Fuzzy Interpolation in Speech Recognition
The hidden Markov model is a popular statistical method used in continuous and discrete speech recognition. The probability density function of observation vectors in each state is estimated with discrete density or continuous density modeling. The performance (in correct word recognition rate) of continuous density HMMs is higher than that of discrete density HMMs, but its computation complexity is very ...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
Bayesian affine transformation of HMM parameters for instantaneous and supervised adaptation in telephone speech recognition
This paper proposes a Bayesian affine transformation of hidden Markov model (HMM) parameters for reducing the acoustic mismatch problem in telephone speech recognition. Our purpose is to transform the existing HMM parameters into a new version suited to a specific telephone environment using an affine function, so as to improve the recognition rate. The maximum a posteriori (MAP) estimation which merges t...
Publication date: 1994